Efficient Imbalanced Multimedia Concept Retrieval by Deep Learning on Spark Clusters

Authors

  • Yilin Yan
  • Min Chen
  • Saad Sadiq
  • Mei-Ling Shyu
Abstract

The classification of imbalanced datasets has recently attracted significant attention because of its implications for several real-world use cases. Such datasets have skewed class distributions in which very few data instances belong to certain classes. Classifiers trained on them tend to favor the majority classes and are biased against the minority classes. Despite extensive research interest, imbalanced data classification remains a challenge in data mining research, especially for multimedia data. Our attempt to overcome this hurdle is a convolutional neural network (CNN) based deep learning solution integrated with a bootstrapping technique. Because convolutional neural networks are computationally expensive to train, particularly on big training datasets, we propose to extract features from pre-trained CNN models and feed those features to a separate fully connected neural network. A Spark implementation shows promising performance of our model in handling big datasets with respect to feasibility and scalability.
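
The abstract does not spell out how bootstrapping is combined with the network, so the following is only a minimal sketch of one common reading: each training batch is drawn with replacement so that every class fills an equal share of it. The function name `balanced_bootstrap`, the toy labels, and the batch size are all hypothetical and are not taken from the paper; in the setting described above, the sampled indices would select pre-extracted CNN feature vectors to feed to the fully connected network.

```python
import random
from collections import Counter


def balanced_bootstrap(labels, batch_size, rng):
    """Sample indices with replacement so each class fills an equal share of the batch.

    Hypothetical helper for illustration; not the authors' implementation.
    """
    # Group sample indices by class label.
    by_class = {}
    for idx, label in enumerate(labels):
        by_class.setdefault(label, []).append(idx)

    classes = sorted(by_class)
    per_class = batch_size // len(classes)

    batch = []
    for c in classes:
        # Sampling with replacement lets a small minority class fill its quota.
        batch.extend(rng.choices(by_class[c], k=per_class))
    rng.shuffle(batch)
    return batch


# Toy imbalanced label set (assumed data): 95 negatives, 5 positives.
labels = [0] * 95 + [1] * 5
rng = random.Random(0)
batch = balanced_bootstrap(labels, batch_size=32, rng=rng)
batch_counts = Counter(labels[i] for i in batch)
print(batch_counts[0], batch_counts[1])
```

With two classes and a batch size of 32, every batch holds 16 indices per class regardless of how skewed the original labels are, so the minority class is not swamped during training. Minority samples are necessarily repeated within a batch, which is the trade-off of this kind of resampling.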

Similar resources

Correlation-Assisted Imbalance Multimedia Concept Mining and Retrieval

In the past decades, we have witnessed an explosion of multimedia data, especially with the development of social media websites and blooming popularity of smart devices. As a result, multimedia semantic concept mining and retrieval whose objective is to mine useful information from the large amount of multimedia data including texts, images, and videos has become more and more important. The h...

Rule-Based Semantic Concept Classification from Large-Scale Video Collections

The explosive growth and increasing complexity of the multimedia data have created a high demand of multimedia services and applications in various areas so that people can access and distribute the data easily. Unfortunately, traditional keyword-based information retrieval is no longer suitable. Instead, multimedia data mining and content-based multimedia information retrieval have become the ...

Automatic Video Event Detection for Imbalance Data Using Enhanced Ensemble Deep Learning

With the explosion of multimedia data, semantic event detection from videos has become a demanding and challenging topic. In addition, when the data has a skewed data distribution, interesting event detection also needs to address the data imbalance problem. The recent proliferation of deep learning has made it an essential part of many Artificial Intelligence (AI) systems. Till now, various de...

Classification and automatic recognition of objects using H2o package

Deep learning (DL) is a process consisting of a set of methods that turn the raw data fed into the machine into meaningful information. Deep convolutional nets are composed of various processing layers that learn and represent the data. They have multiple levels of abstraction to process images, video, speech and audio. H2o deep learning architecture has many features that include supervise...

A Modified Grasshopper Optimization Algorithm Combined with CNN for Content Based Image Retrieval

Nowadays, with huge progress in digital imaging, new image processing methods are needed to manage digital images stored on disks. Image retrieval, which means searching a big database to return images similar to the query image, has been one of the most challenging fields in digital image processing. Although much effective research has been done on this topic so far, t...

Journal:
  • IJMDEM

Volume 8

Publication date: 2017